Research Interests – Alin Dobra

Authors

  • Alin Dobra
  • Minos Garofalakis
Abstract

The recent explosion of the Internet and rapid technological advances in gathering and storing information have resulted in huge amounts of data being collected at a very rapid rate. Developing ways to extract relevant information from such large amounts of data in a human-comprehensible form, and to do so in a timely and cost-effective way, is of great practical importance. Approximate Query Processing and Data Mining, both concerned with extracting useful knowledge from large amounts of data but starting from different premises, have been the subject of my research as a PhD candidate. My work so far is for the most part theoretical in nature, but the problems I have attacked have direct practical applicability. For me, theory is only a tool, albeit a very effective one for gaining insight into a problem, and not an end in itself; I have always accompanied it with implementation and empirical validation. In what follows I describe in some detail my particular interests in these two areas, pointing out past, current, and future work.

Similar resources

The DBO Database System (598)

We demonstrate our prototype of the DBO database system. DBO is designed to facilitate scalable analytic processing over large data archives. DBO’s analytic processing performance is competitive with other database systems; however, unlike any other existing research or industrial system, DBO maintains a statistically meaningful guess to the final answer to a query from start to finish during q...

Probabilistic Characterization of Decision Trees

In this paper we use the methodology introduced in Dhurandhar and Dobra (2006) for analyzing the error of classifiers and the model selection measures, to analyze decision tree algorithms. The methodology consists of obtaining parametric expressions for the moments of the Generalization error (GE) for the classification model of interest, followed by plotting these expressions for interpritabil...

Insights into Cross-validation

Cross-validation is one of the most widely used techniques for estimating the Generalization Error of classification algorithms. Though several empirical studies have been conducted in the past to study the behavior of this method, none of them clearly elucidates the reasons behind the observed behavior. In this paper we study the behavior of the moments (i.e. expected value and variance) of th...

Distribution free bounds for relational classification

Statistical Relational Learning (SRL) is a sub-area in Machine Learning which addresses the problem of performing statistical inference on data that is correlated and not independently and identically distributed (i.i.d.), as is generally assumed. For the traditional i.i.d. setting, distribution free bounds exist, such as the Hoeffding bound, which are used to provide confidence bounds on the ge...

Publication date: 2003